communal violence
How Effectively Can BERT Models Interpret Context and Detect Bengali Communal Violent Text?
Khondoker, Abdullah, Taufik, Enam Ahmed, Tashik, Md. Iftekhar Islam, Mahmud, S M Ishtiak, Sadeque, Farig
The spread of cyber hatred has led to communal violence, fueling aggression and conflicts between various religious, ethnic, and social groups, posing a significant threat to social harmony. Despite its critical importance, the classification of communal violent text remains an underexplored area in existing research. This study aims to enhance the accuracy of detecting text that incites communal violence, focusing specifically on Bengali textual data sourced from social media platforms. We introduce a fine-tuned BanglaBERT model tailored for this task, achieving a macro F1 score of 0.60. To address the issue of data imbalance, our dataset was expanded by adding 1,794 instances, which facilitated the development and evaluation of a fine-tuned ensemble model. This ensemble model demonstrated an improved performance, achieving a macro F1 score of 0.63, thus highlighting its effectiveness in this domain. In addition to quantitative performance metrics, qualitative analysis revealed instances where the models struggled with context understanding, leading to occasional misclassifications, even when predictions were made with high confidence. Through analyzing the cosine similarity between words, we identified certain limitations in the pre-trained BanglaBERT models, particularly in their ability to distinguish between closely related communal and non-communal terms. To further interpret the model's decisions, we applied LIME, which helped to uncover specific areas where the model struggled in understanding context, contributing to errors in classification. These findings highlight the promise of NLP and interpretability tools in reducing online communal violence. Our work contributes to the growing body of research in communal violence detection and offers a foundation for future studies aiming to refine these techniques for better accuracy and societal impact.
- Asia > Myanmar (0.05)
- Asia > Middle East > Israel (0.04)
- Asia > China (0.04)
- (8 more...)
Mapping Violence: Developing an Extensive Framework to Build a Bangla Sectarian Expression Dataset from Social Media Interactions
Tasnim, Nazia, Gupta, Sujan Sen, Shihab, Md. Istiak Hossain, Juee, Fatiha Islam, Tahsin, Arunima, Ghum, Pritom, Fatema, Kanij, Haque, Marshia, Farzana, Wasema, Nasir, Prionti, KhudaBukhsh, Ashique, Sadeque, Farig, Sushmit, Asif
Communal violence in online forums has become extremely prevalent in South Asia, where many communities of different cultures coexist and share resources. These societies exhibit a phenomenon characterized by strong bonds within their own groups and animosity towards others, leading to conflicts that frequently escalate into violent confrontations. To address this issue, we have developed the first comprehensive framework for the automatic detection of communal violence markers in online Bangla content accompanying the largest collection (13K raw sentences) of social media interactions that fall under the definition of four major violence class and their 16 coarse expressions. Our workflow introduces a 7-step expert annotation process incorporating insights from social scientists, linguists, and psychologists. By presenting data statistics and benchmarking performance using this dataset, we have determined that, aside from the category of Non-communal violence, Religio-communal violence is particularly pervasive in Bangla text. Moreover, we have substantiated the effectiveness of fine-tuning language models in identifying violent comments by conducting preliminary benchmarking on the state-of-the-art Bangla deep learning model.
- Asia > Myanmar (0.04)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (16 more...)
- Media > News (1.00)
- Law (1.00)
- Health & Medicine (1.00)
- (2 more...)